-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[cherry-pick] [202205] Fix memory leak issue in ConfigDBConnector. #706
Conversation
Fix memory leak issue in ConfigDBConnector: [chassis] Too many open files error and unable to connect to redis socket error sonic-net/sonic-buildimage#10870 The reason of this issue is DBConnector::pubsub() will return a pointer, and following code call this method but never release the returned pointer: ``` void ConfigDBConnector_Native::db_connect(string db_name, bool wait_for_init, bool retry_on) { m_db_name = db_name; m_key_separator = m_table_name_separator = get_db_separator(db_name); SonicV2Connector_Native::connect(m_db_name, retry_on); if (wait_for_init) { auto& client = get_redis_client(m_db_name); auto pubsub = client.pubsub(); <== this pointer not delete later. ``` Also change DBConnector::pubsub() to deprecated for none SWIG scenario. Change DBConnector::pubsub() to return a smart pointer. Pass all test case. Run following code in python and validate there is no epoll and socket leak: ``` import gc from swsscommon import swsscommon config_db = swsscommon.ConfigDBConnector_Native() config_db.connect() config_db.connect() config_db.connect() gc.collect() ``` <!-- - Note we only backport fixes to a release branch, *not* features! - Please also provide a reason for the backporting below. - e.g. - [x] 202006 --> - [ ] 201811 - [ ] 201911 - [ ] 202006 - [ ] 202012 - [ ] 202106 - [x] 202111 Fix epoll and socket resurce leak issue: [chassis] Too many open files error and unable to connect to redis socket error sonic-net/sonic-buildimage#10870 Co-authored-by: liuh-80 <azureuser@liuh-dev-vm-02.5fg3zjdzj2xezlx1yazx5oxkzd.hx.internal.cloudapp.net>
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
PR been blocked by this UT, however the issue seems not related with the code change in this PR: 2022-11-10T15:51:38.7247858Z test_vlan.py::TestVlan::test_VlanMemberLinkDown FAILED [ 91%] 2022-11-09T04:43:35.3530022Z test_vlan.py::TestVlan::test_VlanMemberLinkDown FAILED [ 91%] |
My another PR on 202205 also have failed on same UT few times: |
Will create another test PR to validate the UT issue not caused by code change in this PR: |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@qiluo-msft , could you please help do a force merge? And I confirm the UT issue not caused by change in this PR with another PR which change nothing: Validation also failed on same UT |
The UT issue caused by a UT added by this PR: sonic-net/sonic-swss#2469 Seems the UT depends on a kernel patch which not ready on 202205: |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Fix memory leak issue in ConfigDBConnector:
[chassis] Too many open files error and unable to connect to redis socket error sonic-net/sonic-buildimage#10870
The reason of this issue is DBConnector::pubsub() will return a pointer, and following code call this method but never release the returned pointer:
Also change DBConnector::pubsub() to deprecated for none SWIG scenario.
Change DBConnector::pubsub() to return a smart pointer.
Pass all test case.
Run following code in python and validate there is no epoll and socket leak:
Fix epoll and socket resurce leak issue:
[chassis] Too many open files error and unable to connect to redis socket error sonic-net/sonic-buildimage#10870
Co-authored-by: liuh-80 azureuser@liuh-dev-vm-02.5fg3zjdzj2xezlx1yazx5oxkzd.hx.internal.cloudapp.net